An elastic-network based local molecular field analysis of zinc-finger proteins 
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We study two designed and one natural zinc-finger peptide each with the Cys2His2 (CCHH) type 
of metal binding motif. In the approach we have developed, we describe the role of the protein 
and solvent outside the Zn(II)-CCHH metal-residue cluster by a molecular field represented by 
generalized harmonic restraints. The strength of the field is adjusted to reproduce the binding 
energy distribution of the metal with the cluster obtained in a reference all-atom simulation with 
empirical potentials. The quadratic field allows us to investigate analytically the protein restraints 
on the binding site in terms of its eigenmodes. Examining these eigenmodes suggests, consistent 
with experimental observations, the importance of the first histidine (H) in the CCHH cluster 
in metal binding. Further, the eigenvalues corresponding to these modes also indicate that the 
designed proteins form a tighter complex with the metal. We find that the bulk protein and solvent 
response tends to destabilize metal-binding, emphasizing that the favorable energetics of metal- 
residue interaction is necessary to drive folding in this system. The representation of the bulk protein 
and solvent response by a local field allows us to perform Monte Carlo simulations of the metal- 
residue cluster using quantum-chemical approaches, here using a semi-empirical Hamiltonian. For 
configurations sampled from this simulation, we study the free energy of replacing Zn 2+ with Fe 2+ , 
Co 2+ , and Ni 2+ using density functional theory. The calculated selectivities are in fair agreement 
with experimental results. 



I. INTRODUCTION 

An important post-translational modification of pro- 
teins involves incorporating metal ions into the protein 
structure In many of these instances, metal-binding 
stabilizes the folded structure or helps fold a previously 
unstructured or partially structured peptide. The ver- 
satility of metals as stabilizers or modifiers of protein 
structure is principally due to the nature of their inter- 
actions with amino acid residues: substantially strong on 
a thermal energy scale and chemically intricate 

The prevalence of metalloproteins and the growing ap- 
preciation of metal-induced (mis)folding in diseases, for 
example, see Refs. makes obtaining a molecular 

level understanding of the role of metals in protein struc- 
ture and function of unquestionable importance. But 
the intricacies of metal-protein interactions makes this 
a formidable challenge to current theory and simulation 
approaches. In a step towards this larger challenge, here 
we address the role of material outside the first-shell of 
the metal in metal-binding and selectivity in a zinc-finger 
protein. 

A satisfactory description of metal-protein interactions 
requires quantum chemical calculations, and these calcu- 
lations, especially at a high-level of theory, are compu- 
tationally demanding. Hence quantum chemical calcula- 
tions are limited to a small group of residues surround- 
ing the metal ion. The effect of the remaining protein 
and solvent medium on the structure and dynamics of 
the metal-residue cluster is typically described in one of 
three ways (for example, see Refs. (u !) ): the medium 
is entirely ignored, effectively simulating the cluster in 
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vacuo; the medium is described as a continuum with 
a dielectric constant; and in the most sophisticated of 
methods, the medium is described using empirical force- 
fields. Among these alternatives, only the last method 
explicitly accounts for the role of the bulk in modulating 
the architecture and dynamics of the cluster. 

The approach we present is a rigorous reduction in the 
degrees of freedom of the system and has the aim of un- 
derstanding the role of the medium outside the defined 
metal-residue cluster: the cluster is described in atomic 
detail and its structure and dynamics are influenced by 
the medium whose effect is described by a local molec- 
ular field. The present approach is inspired by a recent 
development in the theory of liquids [I0j . The essential 
idea in that development is to describe the role of the 
medium external to a defined inner-shell [Io| around a 
solute by a molecular field whose strength is adjusted to 
satisfy suitable consistency conditions, such as the mean 
density of the inner-shell. 

Here we obtain the local molecular field by describing 
the bulk protein outside the cluster as an elastic medium 
[ll|-[l9| . This development allows us to separate the sys- 
tem Hamiltonian into that for the metal-residue cluster 
and the remainder. We then integrate out the bulk de- 
grees of freedom to obtain a molecular field acting on 
the cluster. The strength of the field is adjusted self- 
consistcntly such that the binding energy distribution of 
the metal with the local residues reproduces the binding 
energy distribution obtained in a simulation treating the 
bulk atomically as well. (We focus on the binding en- 
ergy distribution, as this is the most relevant quantity 
for understanding the thermodynamics of metal binding 
to the protein |20l - |23j .) In this initial study, the ref- 
erence all-atom simulation and the simulations to de- 
termine the strength of the molecular field are all per- 
formed using empirical forcefields. With the local molec- 
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ular field determined at the coarse level, we study the 
metal-residue cluster using higher level methods, here us- 
ing semi-empirical and density functional methods with 
large basis sets. 

We use the local molecular field approach to study 
Zn" + binding to a zinc finger domain. Zinc finger do- 
mains are widely distributed in cellular systems, most 
notably in the transcription factor assembly [24| - |26| . The 
isolated zinc-finger domain is partially unstructured in 
the absence of the metal and achieves the correct folded 
conformation only upon binding the metal (27l - [29| . In 
the folded state, Zn 2+ is coordinated tetrahedrally to n- 
cysteine and 4-n histidine residues where n can be 2, 3 
or 4. In the system we study, n is 2. 

The rest of the article is organized as follows. In Sec. ITU 
we derive the local molecular field. Section inil collects de- 
tails of the molecular simulations. Discussions and con- 
clusions follow results presented in Sec. IIV1 



II. THEORY 

Consider the metal-bound protein in a solvent medium. 
We denote the conformational degrees of the protein by 
X = (Xx,X 2 ), where X\ is the metal plus the neighbor- 
ing amino acid residues and X 2 is the remainder of the 
protein. The solvent degrees are given by X a . 

In the canonical ensemble, for a given protein coordi- 
nate X, we can formally integrate over the solvent de- 
grees of freedom and write the effective potential on the 
protein — the potential of mean force — as 

U(X 1 ,X 2 ; P) = U X (X X ) + U 12 (X 1 ,X 2 ) + U 2 (X 2 ) + r)(Xx,X, 

Ui(Xi), U\ 2 (Xi, X 2 ), and U 2 (X 2 ) are site-site, site-bulk, 
and bulk-bulk interactions and r){X\, X 2 ; (3) is the solvent 
response. (Note that, in principle, such a decomposition 
can always be made.) /? = 1/kftT, where T is the tem- 
perature and fee is the Boltzmann constant. We indicate 
the temperature dependence of 77, a thermally averaged 
quantity, to emphasize the distinction from the potentials 
Ui, U 2 , and U± 2 . 

In QM/MM approaches @-[! HI , U x is described quan- 
tum mechanically and U 2 and U\ 2 arc described using 
molecular mechanics. Here we approximate the latter 
two quantities together with the response of the solvent 
by generalized harmonic restraints around the equilib- 
rium structure of the protein. Our ansatz is 



attempts to describe the viscous damping of protein oscil- 
lations as harmonic fluctuations [l6j , but explicit solvent- 
binding site interactions are neglected. We anticipate 
that these long-range contributions can be described us- 
ing mean-field models (such as dielectric models) and 
that they will not contribute to discriminating between 
Zn 2+ and a competing metal bound at the site. 

Our plan is to obtain the quadratic Hamiltonian, H, 
using contact topology based potentials. Since these 
models are valid only up to a proportionality con- 
stant to construct a physical potential energy 
function, we introduce a coupling constant, a, that fixes 
the strength of the harmonic interaction. 

The matrix H, suppressing the dependence on /3 for 
simplicity, is expanded as 



n = 



T~L\ G 
G T H 2 



(4) 



Here T-L\ is a diagonal matrix: no two site-atoms couple 
through Hi as those interactions are explicitly described 
in IA\. (In this regard, the pres ent development differs 
from those presented earlier [ill [H, Hl|.) The matrix Q 
couples the site atoms to the bulk, and the matrix T-L 2 
couples the bulk atoms with each other. 

Under the approximations noted above, for the sys- 
tem modeled by U(X±, X 2 ; 0), the excess Hclmholtz free 
energy, ,A CX , is given by 



= J e -mxuW) dXld x 2 



(5) 



Since the bulk protein coordinates X 2 appear quadrati- 
;(76tally, following [31|, we integrate over X 2 and get 



(6) 



where 



%sitc — H\ — GH 2 l G T . (7) 



U(Xx,X 2 ;/3) a Ux(Xx) + [SXx,5X 2 ] aUifi) 



SXx 
SX 2 



where 



5Xi = Xi 



X itP for i 



1.2. 



(2) 



(3) 



Here, Xi }P for i = 1,2 are the reference coordinates for 
the binding site and the protein medium obtained from 
the three dimensional structure of the protein. The ap- 
proximation made in Eq. [5] is physically motivated and 



Thus, under the approximations noted above, the effec- 
tive potential for the site is given by 

U{Xx;p) = Ux(Xx) + SX^an site SXx 

= Ux{Xx) + 4> m {Xx\a), (8) 

where in addition to the site-site interaction potential, 
Ui(Xi), the effective potential contains a local molecu- 
lar field, <p m (Xx\a) — S X± aW s it c S <X X , that describes the 
effect of the bulk protein and solvent damping on the 
site atoms. (For notational simplicity, the temperature 
dependence on <f> m is not explicitly shown.) The field 
<f> m acts as a restraint that limits the deflections of the 
binding site away from some reference state. (A suitable 
reference state can be the PDB structure.) For a = 0, 
there is no coupling between the site and the bulk, and 
the model reduces to a metal-residue cluster in vacuum. 
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A. Selectivity of the binding site 



The selectivity of the zinc finger peptide for Zn 



2 + 



over 



another transition metal X 2+ (X 2+ 
Ni 2+ ) is determined by 



Co 2+ , Fe 2+ , and 



A M ex = [/4* 2+ -m&h-KS)-^' 
= A// X (S) - A^ ox (aq) 



ex 
X 2 + 



/4* 2 +K ac 



(9) 



where ^ ox (aq) is the change in hydration free energies 
and ^ CX (S) is the corresponding quantity in the protein. 
It is understood that a common reference state is used in 
defining fi^ 2 + in water and in the protein pol . [22| . 

Calculating A^ 0X (S) presents significant challenges, in- 
cluding the need to describe interactions quantum me- 
chanically together with sampling binding site conforma- 
tions. It is here that the reduction in degrees of freedom 
made possible by the effective potential (Eq. [5]) proves 
helpful. 

Following Eq.0 A/i GX (S) is given by 0, H H2 



/3A P -(S) = / -PAU^Xr) . -0A</>„ 



= (e 

(e 



/Zn 2 "t 



-PAUi(Xi) 



)zn 2 



(10) 



where t/ X 2+ (X 1 ; (3) is the effective potential (Eq. [5]) with 
X 2+ in the site, IaUi = t/i i x 2 + — ^i.Zi^+i A$> m = 
0m,x 2 + — 0mZn 2 +j and (. . .)zn 2 + indicates canonical av- 
eraging with Zn 2+ bound to the protein. In Eq. [TQl by 
ignoring A0 m we are assuming that the response of the 
material outside the defined metal-residue cluster is the 
same for both X 2+ and Zn 2+ and thus the contribution to 
the selectivity free energy arises solely from interactions 
within the binding site. (We comment on this approxi- 
mation below.) 



B. Coupling constant a 

The excess chemical potential, /x| x 
potential distribution theorem [23, [35 



is given by the 



/4 X 2+ = fc B Tlog 



P(e)de 



(11) 



where P{e) is the probability distribution of the binding 
energy of the metal ion with the surrounding material. 
Since Mzn 2 + ^ s ^ ne essential quantity characterizing the 
thermodynamics of Zn 2+ binding to the site, we choose 
a such that the energy distribution from the approxi- 
mate model (Eq. [5]) reproduces the binding energy dis- 
tribution of Zn 2+ with the site from all-atom molecular 
simulations. It is most economical computationally, if 
these simulations are based either on an empirical po- 
tential energy function or a QM/MM approach with a 
less expensive quantum mechanical model. (Here we use 
an empirical potential energy function.) Then with the 
molecular field based on a coarse energy function, one can 
refine the description of the site with high-level quantum 
mechanical calculations in the presence of the molecular 
field. 



To obtain a, we minimize the Kullback-Liebler di- 
vergence [HI between the distribution, P s (e), obtained 
with simulation using a coarse model, and the distribu- 
tion, P(e;a), obtained from simulations of the site with 
the effective potential (Eq. [8]). Due to the exponential 
weighting e^ e , the high-e tail of the energy distribution 
P(s) more sensitively determines the excess free energy 
in Eq. 1111 Hence we require simulations with the model 
Hamiltonian (Eq. [8]) to reproduce the high-e tail. Thus, 
we restrict our attention to energy values e > e, where 
the mean binding energy is s. a is then obtained by 
minimizing 



A (a) = log 



(12) 



III. METHODS 
A. Elastic network model 

We construct the quadratic Hamiltonan (H) based on 
the Gaussian network model described in Ref. [IH. The 
interaction energy is expanded around the equilibrium 
structure of the protein in a Taylor series and only terms 
to second order are retained. 

Denoting the equilibrium distance between particle i 
and j as , the deviated distance by dij , and the devi- 



ation by Xij (= di 



j), we have 



U(d ij )ttU(r ij ) + 



U"{n 3 ) v 

2 ^ 



2 ^ij^ij- 



(13) 



fi,is are the labels for Cartesian components x, y, and 
z. U"(rij) is the second derivative of the potential en- 
ergy function. For simplicity, we assume that the sec- 
ond derivative is equal for all the interactions and that 
only particles within 6 A of each other interact. This 
cutoff value follows earlier studies using elastic network 
models and other potential functions dependent on topo- 
logical contact maps [nl - [l6l Il9j . We do not include U\ 
type interactions in constructing % as those interactions 
are dealt in atomic detail. Further, the inversion of H2 
(Eq. [7]) is defined only within the space of eigenvectors 
with non-zero eigenvalues; the remaining six zero-modes 
correspond to rotations and translations and do not con- 
tribute to the potential energy. 



B. Molecular dynamics simulation 

We consider three different zinc finger proteins to eval- 
uate our model: the consensus peptide (CP1) ( 29]: PDB 
ID: 1MEY), the peptide (YTA) based on the human 
zinc finger protein 32 (PDB ID: 2 YTA) , and TF3, and 
the transcription factor IIIA (HU; PDB ID: 1TF3). Of 
these CP1 and YTA are designed peptides. Residues 
between 139 and 162 forming the zinc-finger domain in 
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FIG. 1. The zinc finger domain (pdb ID:2YTA) comprises 
the a helix and the /3 sheet. The binding site residues (Cys 
and His) are shown as stick figures. These residues and the 
metal (sphere) totaling 59 particles are described by the site 
coordinates X\ (Eqs.[T]and[8]) and are throughout represented 
in atomic detail. 



energy function that we have implemented in a collection 
of Fortran codes. To calculate the field contribution in 
Eq. [51 we find the deflection of a configuration from the 
protein reference structure, 8X\ = X\ — X\ p , and obtain 
<f> m (X\ ; a) form the known matrix H s i te using 

( j> m {X 1 ;f3)=6X 1 aH sitc SX 1 . 

For each a, the simulations consist of 15 x 10 6 sweeps of 
equilibration followed by 15 x 10 6 sweeps of production. 
Every sweep comprises three sets of moves: 1) displace- 
ment of each particle; 2) rigid-body displacement of each 
amino acid residue; and 3) rigid-body rotation of each 
amino acid residue. Each of these moves is performed 
with a probability of 0.5. Displacements are made along 
randomly chosen directions and rotations are made along 
each axis using randomly chosen angles. The standard 
Metropolis criterion is then used to accept the sweep. In 
the equilibration phase, the maximum displacements and 
rotation are adjusted such that the acceptance ratio of 
sweeps stabilizes at « 0.25. (These maximum values are 
retained without further adjustments in the production 
phase.) Configurations are stored every 500 sweeps for 
further analysis. 



YTA are used in the simulations; residues outside the 
zinc-finger domain are not considered. The zinc-atom 
is coordinated by two cysteine thiolates and two his- 
tidines. We use the CHARMM27 forcefield [H for all 
the amino acid residues, with the thiolate partial charges 
from Refs. [13, H|j]. Simulations are performed with 
NAMD2 [13. 

We use the dummy cation Zn 2+ model developed in 
Ref. ■ The key feature of this model is the presence 
of four dummy sites disposed tetrahedrally at a distance 
of 0.9 A from a central atom. The dummy sites have a 
partial charge of 0.5e but no size. The presence of dummy 
sites proves helpful in maintaining the four-coordinate 
state of the metal in extended simulations |40| . 

Each of the proteins CP1, YTA, and TF3 are sol- 
vated in TIP3P [4l|, l42j water molecules using the sol- 
vate module in VMD [43| . After an initial minimization, 
the systems are equilibrated for 1 ns at 298.15 K and 1 
bar followed by a 3 ns production phase. Configurations 
are saved every 0.1 ps for analysis. The temperature is 
maintained by a Langevin thermostat while pressure is 
maintained using a Langevin barostat [44| . The reference 
binding energy distribution P s (e) (Eq. [T2"|) was calculated 
using code developed in-house. 

C. Monte Carlo simulations 



The X\ coordinates describe the Zn 2+ ion and the 4 bind- 
ing site residues, in all 59 atoms (Figure QJ. We calcu- 
late Ui{X x ) (Eq. HJ} using the CHARMM j36| potential 



D. Monte Carlo simulations with semi-empirical 
potentials 

Once we have the local molecular field, we can, in prin- 
ciple, examine the site with Monte Carlo or molecular dy- 
namics at a high level of theory. However, even for 59 par- 
ticles, our initial attempts at using B3LYP/TZV(2d+p) 
in the Monte Carlo scheme proved intractable, as ex- 
cessively long times were deemed necessary for adequate 
convergence. For this reason, in this initial study, we 
resorted to Monte Carlo simulation of the binding site 
using the semi-empirical PM3 (45], [4^] model in the Gaus- 
sian09 package [47| . The simulations are performed with 
the molecular field and the optimal value of a determined 
with the coarse potential. The equilibration and produc- 
tion comprises 50 x 10 3 and 25 x 10 3 sweeps respectively. 
Hydrogen atoms are added on the fly to satisfy valencies 
of C and N atoms. In the equilibration phase, the accep- 
tance ratio is 0.25. Configurations are saved after every 
5 sweeps. 

To compute the free energy change (Eq. [10]) in replac- 
ing Zn 2+ with X 2+ (= Co 2+ , Fe 2+ , Ni 2+ ), we assume that 
the configurations generated with the PM3 hamiltonian 
well-represent the configurations that would be produced 
with the B3LYP/TZV(2d+p) method, and then take 
150 equally spaced configurations from the production 
phase of the PM3-based simulation. For each configu- 
ration, the energy change in exchanging Zn 2+ (Eq. [TU]) . 
AUi = U\ x2+ — U\ zn 2 + , is calculated at the unrestricted 
B3LYP/TZV(2d+p) level using GAMESS H|. Consis- 
tent with experiments, we model all metal complexes in 
their high-spin electronic configuration [49|, • 
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FIG. 2. A (Eq. I12[) as a function of the coupling constant a 
with (filled) and without (open) site-bulk coupling (see text). 
The minimum of A is at a = 1.4 ksT. 



To the test the effect of suppressing protein restraints, 
we also compute the free energy of replacing Zn 2+ with 
Co 2+ or Fe 2+ . 50 x 10 3 sweeps of equilibration and 
25 x 10 3 sweeps of production were performed with the 
PM3 hamiltonian and a = 0.0. We then take 80 well 
separated configurations from the production phase to 
obtain a qualitative estimate of the free energy change 
with energies obtained at the B3LYP/TZV(2d+p) level. 

For obtaining A/x ox (aq), the change in the excess free 
energy in bulk water (Eq. ^ , we borrow from our earlier 
results based on the primitive quasi-chemical approach 
[5l| . Geometry, thermal corrections, and long-range con- 
tributions to the free energy are obtained from the ear- 
lier study, but for consistency with the presence work, 
single point energy calculations are performed at the un- 
restricted B3LYP/TZV(2d+p) level. 



IV. RESULTS AND DISCUSSION 

Unless otherwise mentioned, we report the key results 
of our model extensively only for YTA. Similar agreement 
is observed for CP1 and TF3. The selective preference 
for Zn 2+ over competing Co 2+ , Ni 2 +and Fe 2+ has been 
experimentally measured for CP1 [231 an d serves as a 
helpful metric to assess the present approach in quanti- 
fying thermodynamics of ion binding to protein sites. 



A. Validation of the quadratic model 

Figure [5] shows A(a) (Eq. [12} between P s (s), the ref- 
erence binding energy distribution, and P(e; a) obtained 
using the effective potential. A (a) is a minimum at 
a = 1.4; this is the value of a chosen for all further 
investigations. To examine the role of site-bulk inter- 



FIG. 3. Distribution of binding energies of Zn 2+ with the 
site from the reference MD simulations (solid curve) and 
with simulations using the effective potential (Eq. [8} with 
(j> m {Xy,a = 0) (o) and with <f> m (Xi;a = 1.4) (A). 

actions in site-site interactions, we set Q = (Eq. [4|; 
that is, we suppress site-bulk interactions. Physically, 
this corresponds to independent springs on site particles, 
as opposed to a site of interconnected particles. As Fig- 
ure [2] illustrates, with no site-bulk interactions A(a) is 
close to zero for all non-zero a: the probability distri- 
bution for binding energies P(s; a) has only a minimal 
overlap with the target distribution P s (s). Thus, in ac- 
cordance with experiments indicating the importance of 
second-shell interactions in tuning metal binding [H, [29| 
in the zinc-finger protein, the interaction of the site with 
the bulk material (protein outside the site and the sol- 
vent) is important in tuning the binding energy of Zn + 
with the site. 

Figure [3] shows that the site in vacuum (a = 0) pro- 
duces a binding energy distribution that is substantially 
different from either the MD simulations or with the op- 
timal (a = 1.4) site-bulk coupling constant. In particu- 
lar, in the MD simulation and for the site with the local 
molecular field, ion-site interactions are shifted to more 
positive values than for the site in vacuum. Thus, on 
average, Zn 2+ is better bound to the cluster in vacuum 
than it is when bulk protein and solvent effects are con- 
sidered. Physically this implies that the role of the bulk 
protein and the solvent is to destabilize the binding of 
the ion with the site. Or in other words, the folded (3f3a 
conformation of the protein is stabilized by the introduc- 
tion of the metal. Note that experiments suggest that 
metal coordination is essential for the protein fold and 
the metal free apo-peptide is unstructured [2714291 ] , 

Figure [4] shows that distribution of the potential en- 
ergy of the site (excluding the Zn 2+ atom). The po- 
tential energy U S i tc includes contributions from bonding, 
angles, torsions, improper angles, and non-bonded terms 
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280 320 360 400 

U site (kcal/mol) 

FIG. 4. Distribution P(U s itc) of the potential energy U s it e 
of the site particles excluding the Zn 2+ atom. Symbols as in 
Figure [3] Observes that the site in vacuum is more stable 
than the site including the bulk protein and solvent response. 

within the site. The site in vacuum is stable by about 
6 kcal/mol (A(U s it e ) = —6 kcal/mol between the binding 
site in vacuum and MD simulations) than the one coupled 
to the protein and solvent, again emphasizing the role of 
the metal in stabilizing the protein fold. Note that the 
coupling constant, a, was chosen to match the binding 
energy profile, P{e) of the Zn 2+ ion. The agreement in 
the distribution, P(U s ;t e ), of the binding site energetics is 
an independent verification of our model. Further, this 
observation suggests that the ENM can be successfully 
parametrized with respect to either ion binding energet- 
ics or site energetics. 

As we have argued before [22|, HH , the conformations 
that correspond to low U s it e , that is conformations for 
which the site is less strained energetically, are also the 
ones for which the ion-site interactions are unfavorable 
for the ion: the less strained conformations of the site 
correspond to the high energy tail of the ion-site distri- 
bution. As Eq. [TT] emphasizes, these conformations are 
also the ones that sensitively influence the excess chem- 
ical potential of the ion in the site. In this context it is 
important to note that the local molecular field is able to 
capture both the low energy wing of P(U s i tc ) (Figure 0} 
and the high-energy wing of P{e) (Figure [3]). 



B. Structural characterization of the binding site 

In Table U we summarize the various structural param- 
eters that are relevant to the geometry of the residues 
around the Zn 2+ atom. We find that Zn 2+ is able to 
maintain a tetra-coordinate state throughout the course 
of the simulation: the sulphur and nitrogen atoms from 
the cystine and histidinc residues, respectively, are al- 



ways coordinated with the metal. 

The geometry of the binding site in vacuum (a = 0) 
and with the optimal site-bulk coupling in the effective 
potential (Eq. [5J agree reasonably well with MD simula- 
tions. The ZN-Zn 2+ -N angle is somewhat more expanded 
in the MD simulations and the ZS-Zn 2+ -S is somewhat 
more compressed relative to the site with either a = 
or a = 1.4. But the fluctuations are large and it is un- 
clear if these differences are significant. This observation 
suggests that in tuning the molecular molecular field, en- 
ergies may prove more sensitive than structural parame- 
ters. 



C. Investigation of the protein restraints 

It is known that the folding of the zinc finger peptide 
is coupled with metal binding [27l - l29j . Thus, residues 
which are crucial in maintaining the binding site are also 
expected to be important in the folding of the peptide. 
It has been suggested that the /3-sheet hosting the two 
cysteine residues is formed in the absence of Zn 2+ and 
the addition of the metal induces the folding of the a- 
helix [53 - l54| . The coordination of Zn 2+ with the histi- 
dine residues is thought to be essential towards folding of 
the a-helix and formation of the hydrophobic core. We 
next consider how analyzing the protein field can provide 
insights into some of these features. 

MD MC a = MC a = 1.4 

r s _ Zn 2+ 2.17 ± 0.03 2.16 ± 0.03 2.15 ± 0.03 

r N -z n 2+ 2.08 ± 0.04 2.1 ± 0.05 2.07 ± 0.03 

ZS-Zn 2+ -S 112.0 ± 4.5 120.2 ± 5.9 121.9 ± 4.2 

ZN-Zn 2+ -N 104.6 ± 4.6 102.2 ± 4.6 100.5 ± 3.2 

TABLE I. Bond lengths and angles characterizing the geom- 
etry of the site. All angles are in degrees and distances in 
Angstroms. MD, molecular dynamics; a = (vacuum) and 
a — 1.4 feT, correspond to Monte Carlo simulations with the 
effective potential (Eq. [8). Standard deviations of the quan- 
tities are noted. 

The restraints imposed by the protein on the bind- 
ing site, cf> m (Xi]a) in Eq. [5J are collective in nature and 
cannot be expressed as a sum of independent restraints 
over individual atoms of the binding site. The quadratic 
approximation decomposes the restraints as a sum over 
mutually orthogonal collective motions of the binding site 
atoms. These collective motions or vibrational modes 
are along the eigenvectors of the positive definite matrix, 
Hsitc in Eq. [8j describing the protein field, (j) m (Xi, ; a). 

We classify the vibrational modes as either concen- 
trated on a single residue or distributed on more than 
one residue. Since the eigenvector has contributions from 
each of the atom comprising the binding site, we consider 
the contribution to the norm of the eigenvector from each 
residue of the binding site to evaluate the behavior of the 
modes. We call the mode concentrated if the majority of 



7 




Cys Cys His 



Cys Cys His His 



Cys Cys His His 



FIG. 5. Vibrational modes for YTA, CP1, and TF3 arranged according to the residue of the binding site which contributes 
most to its magnitude. The eigenvalue corresponding to each mode in feT/ 'A is plotted for each of the four metal-binding 
residues. Red lines denote concentrated modes while blue lines denote distributed ones. Observe that the strongest vibrational 
modes are concentrated on the first cysteine and the first histidine of the CCHH cluster comprising the binding site. 



the contribution to the norm is found on one residue (We 
choose a threshold of 0.8 for delineating concentrated 
modes from all other modes. The qualitative insights 
below are independent of this threshold.) Following this 
procedure for all the modes, Fig. ?? shows that the most 
important protein restraints for three different zinc fin- 
ger domains are in fact concentrated on the first cysteine 
and the first histidine residue of the CCHH cluster com- 
prising the binding site. This implies that the first C 
and the first H play an important role in maintaining the 
architecture of the site. 

Nomura and Sugiura [55j conducted experiments on 
the Spi zinc finger domain where they mutated one 
residue at a time in the CCHH binding site to a glycine 
residue, eliminating the propensity of that particular 
residue to coordinate with the Zn 2+ center. The authors 
showed, through circular dichroism measurements, that 
the CCHG mutant of the protein is able to form an a- 
helix and a /3-sheet structure similar to that of the wild 
type protein, while no helical structure is formed with 
the CCGH mutant, implying the importance of the first 
histidine in maintaining the architecture of the binding 
site. Our analysis of concentrated modes is consistent 
with this experimental observation of the importance of 
the first histidine in the CCHH cluster. 

Since the /3-sheet hosting the first cysteine is formed 
prior to the introduction of the metal, analysis of the 
modes in the presence of the metal is not sufficient to infer 
the consequences of mutations in the cysteine residues. 
Our analysis does suggest that the Zn 2+ will be loosely 
held in the CCHH finger and experience larger fluctua- 
tions compared to the CGHH finger. Spectroscopic ex- 
periments to probe the dynamics of the binding site can 
be helpful in evaluating this suggestion. 

Fig. ?? shows that the eigenvalues A defining the 
strength of the restraints on the binding site are larger 
for the designed peptides CP1 and YTA than for TF3: 
deflections of the binding site from the equilibrium struc- 
ture require more energy for the designed peptides than 
for the natural TF3 peptide. Thus we expect CP1 and 
YTA to experience smaller deviations from the equilib- 



rium structure as compared to TF3. Since metal coor- 
dination has a stabilizing effect on the protein fold and 
since deformations from the equilibrium structure will 
also tend to destabilize Zn 2+ binding to the protein, we 
can induce that CP1 and YTA will experience a higher 
stabilization due to introduction of the metal as com- 
pared to TF3. It is interesting to note that CP1 was de- 
signed by aligning 131 natural zinc finger sequences and 
is found to have a greater affinity for zinc than the corre- 
sponding natural sequences [56[. The predicted greater 
stabilization of Zn 2+ -bound CP1 is in accordance with 
this experimental observation. 

Sequence specific effects on stabilization can also be 
inferred. In CP1 and YTA only two residues interleave 
the cysteine residues in contrast to four residues in the 
natural peptide. This shorter turn between the cysteine 
residues likely underlies the observed differences in bind- 
ing energetics, but a thorough exploration of how such 
sequence dependences influence binding is left for future 
investigations. The analysis of concentrated modes de- 
veloped here may prove helpful in this regard. 



D. Selectivity of the site for Zn 2+ over competing 
transition metals 

Table |TT] summarizes the predicted values for the free 
energy difference between Zn 2+ and competing divalent 
ions for the consensus peptide CP1. The order of selec- 
tivity Co 2+ < Ni 2+ < Fe 2+ relative to Zn 2+ is consistent 
with experiments [29[ . The magnitude of selectivity for 
Fe 2+ is also in good agreement with experiments. 

We first consider limitations in calculating A^ ox (aq) 
in affecting the predicted selectivity, A/i cx (Eq. [9]). 
The aqueous component of the selectivity free energy, 
A/x ex (aq), is only about 1% of the absolute hydration 
free energy of the metal ion (5lj ; thus small errors in the 
absolute hydration free energy can be amplified in taking 
differences. In the present study, we have used a prim- 
itive quasichemical estimate [51j for A^ ox (aq): the free 
energy of forming a metal ion- water cluster in vacuum is 
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Ion 


a 


A M CX (S) A^ 


(aq) A M CX 


(Calc.) 


A^ ox (Expt.) 


Co 2+ 


1.4 


6.3 


5.2 


1.1 


5.4 




0.0 


6.9 


5.2 


1.7 


5.4 


Fe 2 + 


1.4 


29.2 


20.4 


8.4 


7.8 




0.0 


16.9 


20.4 


-3.5 


7.8 


Ni 2 + 


1.4 


-7.7 


12.6 


4.3 


7.4 



TABLE II. Free energy to replace Zn 2+ Co 2+ , Fe 2+ , and 
Ni 2+ (Eq. H| in the CP1 peptide. All values are in kcal/mol. 
A/i ex (S) is calculated using Eg. 1101 Calculations for a cluster 
in vacuum (a = 0) and for the cluster in a molecular field 
(a = 1.4fc B T) are reported for Co 2+ and Fe 2+ . Aj/ X (aq) was 
obtained using primitive quasi-chemical theory [51|]. (Please 
refer to Sec. IIII Dl for details.) Experimental estimates from 
Ref. [2!| are provided for comparison. The calculated selec- 
tivities for Co 2+ , Fe 2+ , and Ni 2+ , respectively, in the YTA 
peptide are 1.3, 8.1, and 6.7, similar to the estimate for CP1. 
We are not aware of experiments characterizing the selectivity 
in YTA. 



combined with a continuum dielectric correction for the 
presence of the bulk material. Ignoring the role of the 
bulk material in forming the metal- water cluster J57j and 
potential limitations in a continuum dielectric description 
of water beyond the first shell for highly charged metals 
[58l . [HH are important limitations in our calculation of 
A/z ex (aq). 

Limitations in calculating A^ ex (S) is the other factor 
in affecting the predicted selectivity. Examining the dis- 
tribution of AUi (Eq. [Trj|) . the difference in the bind- 
ing energy of metal X 2+ with the binding site relative 
to the corresponding value for Zn 2+ , shows that AU± is 
narrowly distributed about the mean value (AUi) (Ta- 
ble IIII[) , although the distribution of the binding energy 
of the metal with the cluster itself is broad (Fig. [3]). Thus 
the calculated free energy differences in the presence of 
the field are expected to be reasonably well-converged. 

Table [TT| shows that the protein field plays a decisive 
role in selectivity. The protein restraints limit the phase 
space distribution of the binding site to configurations 
which bind to Zn 2+ strongly. Removing these restraints 
allows the binding site to sample configurations which 
bind favorably to Fe 2+ as well. We find that the free 
energy for replacing Zn 2+ with Fc 2+ without the protein 
field (a = 0) predicts greater stabilization of the zinc- 
finger in the presence of Fe 2+ , in complete disagreement 
with experiments. The role of the protein in replacing 
Zn 2+ with Co 2+ is predicted to be negligible. Exam- 
ining the distribution of AU\ shows that the protein 
tends to limit energy fluctuations (C2 term, Table IIII[) 
and the effect is more pronounced for metal ions that in- 
teract strongly (relative to Zn 2+ ) with the zinc-binding 
residues, as the comparison of Fe 2+ and Co 2+ illustrates 
(cf. C\ and C2 Table HIT]) . Taken together, these obser- 
vations show that while the local metal-residue interac- 
tion is important in selectivity, as was already inferred 
in the early studies on zinc-fingers (49j. the role of the 
protein cannot be ignored, a result that is in accordance 



with experimental investigations comparing two different 
zinc-binding proteins [fjfjj . 

An important limitation in the present study is the 
small number of configurations sampled at the semi- 
empirical level. The small magnitude of C2 relative to 
C\ suggests that this limitation may not be severe in the 
case when the protein field is present. In any case, with 
a more efficient implementation of the present approach, 
this limitation can always be overcome. Regardless of 
these computational limitations, the important physical 
point we wish to emphasize is that accounting for the 
protein field is necessary for a broader characterization of 
the thermodynamics of metal-ion binding in this system. 
Thus, for example, the protein field will be necessary for 
a proper description of excess entropies and excess ener- 
gies, quantities that demand a proper description of the 
role of the medium on the local ion-residue cluster 1231 . 



Ion 


Q 


Am cx (S) 


Ci 


C*2 Cl 


-C*2 


Co 2 + 


1.4 


6.3 


6.8 


0.5 


6.3 




0.0 


6.9 


7.0 


0.9 


6.7 


Fe 2 + 


1.4 


29.2 


29.2 


0.5 


28.7 




0.0 


16.9 


29.8 


37.0 


-7.2 


Ni 2 + 


1.4 


-7.7 


-7.1 


1.3 


-8.4 



TABLE III. Comparing the exponential average (Eq. [9| with 
the Gaussian model A/j ex (S) = Ci — C%, where Ci = (AZ7i) 
and C 2 = /3/2((AUi - Ci) 2 ). Ci and 2C 2 //3 are the first 
(mean) and second (variance) moments of the distribution of 
AUi values [&|. Calculations for a cluster in vacuum (a — 0) 
and for the cluster in a molecular field (a = 1.4 ksT) are 
reported for Co 2+ and Fe 2+ . All values in kcal/mol. 



Changes in the equilibrium geometry of the protein 
can be expected with ion exchange owing to change in 
the ion size, which, within the quadratic approximation, 
will change the harmonic restraints on the binding site 
and thus the estimate of the selectivity free energy. Here 
A/x ex (S) was calculated with the assumption that the 
molecular field imposed by the protein medium and the 
solvent is independent of the bound ion, A<fi w 0. For the 
given coordination environment, the radii for metals con- 
sidered here are expected to be similar and so assuming 
A(f> « appears reasonable. This assumption will cer- 
tainly not hold when a metal associates with the binding 
site in a geometry substantially different from the one for 
Zn 2+ as may happen for Hg 2+ 



61 



V. CONCLUDING DISCUSSIONS 

The architecture and conformation of the metal bind- 
ing site in metalloproteins is conditioned by energetic in- 
teractions of the metal cation and the binding site on 
the one hand and the interaction of the binding site with 
the protein and solvent media outside the binding site on 
the other. In the case of the zinc finger peptide, reflecting 
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the fact that the peptide is unfolded in the absence of the 
metal, calculations show that including the interaction of 
the site with the protein material outside the binding site 
results in a more destabilized metal-binding site complex 
in comparison to an analogous metal-binding site com- 
plex in vacuum. 

The protein outside the metal-residue cluster imposes 
a field on the cluster. By treating the bulk protein as 
an elastic medium, we describe the molecular field by 
quadratic model. This simplifies the problem of tackling 
metal protein interactions by greatly reducing the num- 
ber of degrees of freedom and allows us to study the effect 
of the protein medium on the binding site in a trans- 
parent fashion. Decomposing the protein restraints on 
the binding site over mutually orthogonal collective mo- 
tions of the binding site particles shows that the protein 
restraints are stronger for designed zinc-finger peptides 
CP1 and YTA relative to the natural zinc finger TF3. 
Further, for all three zinc finger peptides, we find that 
the first cysteine and first histidine in the CCHH bind- 
ing site are more tightly held by the protein as compared 
to the other two residues of the binding site. Our re- 
sults are in accordance with the importance of the first 
histidine observed on the basis of metal-induced folding 
experiments with a CCGH mutant binding site. We also 
predict that were a peptide with the GCHH binding motif 
to fold in the presence of Zn 2+ , that metal-residue clus- 
ter is expected to experience large fluctuations relative 
to the CCHH binding motif. 

Approximating the solvent and the protein response 
by a quadratic term implies that we have neglected spe- 
cific interactions of solvent molecules on the dynamics 
of the binding site. Care would be required when the 
present model is applied to ion binding sites where water 
molecules play a crucial role. The molecular field ap- 
proach developed here requires the equilibrium structure 
of the protein in order to estimate the response of the 
bulk protein due to the binding site. If this response is 
not sensitively dependent on the bound ion, the molecu- 
lar field obtained on the basis of a protein structure for 
one ion can be used to predict the free energy change in 
replacing that ion with a competing one. Such calcula- 
tions on the zinc-finger peptide give reasonable estimates 
of the free energy of replacing Zn 2+ with Co 2+ , Fe 2+ , and 
Ni 2 +. 
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CHARMM [H, and additional data for CP1 and TF3 
peptides 
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